Selection Process
Hypothesis on the Functional Advantages of the Selection-Broadcast Cycle Structure: Global Workspace Theory and Dealing with a Real-Time World
Nakanishi, Junya, Baba, Jun, Yoshikawa, Yuichiro, Kamide, Hiroko, Ishiguro, Hiroshi
This paper discusses the functional advantages of the Selection-Broadcast Cycle structure proposed by Global Workspace Theory (GWT), inspired by human consciousness, particularly focusing on its applicability to artificial intelligence and robotics in dynamic, real-time scenarios. While previous studies often examined the Selection and Broadcast processes independently, this research emphasizes their combined cyclic structure and the resulting benefits for real-time cognitive systems. Specifically, the paper identifies three primary benefits: Dynamic Thinking Adaptation, Experience-Based Adaptation, and Immediate Real-Time Adaptation. This work highlights GWT's potential as a cognitive architecture suitable for sophisticated decision-making and adaptive performance in unsupervised, dynamic environments. It suggests new directions for the development and implementation of robust, general-purpose AI and robotics systems capable of managing complex, real-world tasks.
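To make the cycle concrete, here is a minimal, hypothetical sketch of a Selection-Broadcast loop: specialist modules propose salience-scored content, the most salient proposal is selected, and its content is broadcast back to every module before the next cycle. The module names and salience scoring are illustrative stand-ins, not the paper's implementation.

```python
import random

# Minimal sketch of a Selection-Broadcast cycle (hypothetical modules;
# not the paper's implementation).

class Module:
    """A specialist process that proposes content and receives broadcasts."""
    def __init__(self, name):
        self.name = name
        self.context = None  # last broadcast content

    def propose(self, stimulus):
        # Salience is random here; a real system would score relevance.
        salience = random.random()
        return salience, f"{self.name} interpretation of {stimulus!r}"

    def receive(self, content):
        self.context = content  # the broadcast updates every module's context

modules = [Module(n) for n in ("vision", "language", "planning")]
for step, stimulus in enumerate(["obstacle ahead", "user speaks"]):
    # Selection: the most salient proposal wins access to the workspace.
    salience, content = max(m.propose(stimulus) for m in modules)
    # Broadcast: the winning content is shared with all modules.
    for m in modules:
        m.receive(content)
    print(f"cycle {step}: broadcast -> {content}")
```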
Improved seeding strategies for k-means and k-GMM
Carrière, Guillaume, Cazals, Frédéric
We revisit randomized seeding techniques for k-means clustering and k-GMM (Gaussian Mixture Model fitting with Expectation-Maximization), formalizing their three key ingredients: the metric used for seed sampling, the number of candidate seeds, and the metric used for seed selection. This analysis yields novel families of initialization methods exploiting a lookahead principle--conditioning seed selection on closer coherence with the final metric used to assess the algorithm--and a multipass strategy to tame the effect of randomization. Experiments show a consistent constant-factor improvement over classical contenders in terms of the final metric (SSE for k-means, log-likelihood for k-GMM), at a modest overhead. In particular, for k-means, our methods improve on the recently designed multi-swap strategy, which was the first to outperform the greedy k-means++ seeding. Our experimental analysis also sheds light on subtle, often overlooked properties of k-means, including the (lack of) correlation between the SSE upon seeding and the final SSE, the variance-reduction phenomena observed in iterative seeding methods, and the sensitivity of the final SSE to the pool size for greedy methods. Practically, our most effective seeding methods are strong candidates to become one of the--if not the--standard techniques. From a theoretical perspective, our formalization of seeding opens the door to a new line of analytical approaches.
- Europe > France > Provence-Alpes-Côte d'Azur (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > Wales (0.04)
- Asia (0.04)
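As a rough illustration of the lookahead principle described in the abstract above, the sketch below samples candidate seeds by the usual D² weighting but selects among them by the final assessment metric (SSE), mirroring greedy k-means++ rather than the paper's exact methods. Function and parameter names are hypothetical.

```python
import numpy as np

def lookahead_seeding(X, k, n_candidates=5, rng=None):
    """Greedy k-means++-style seeding: candidates are sampled by D^2
    weighting, and each round keeps the candidate minimizing the SSE,
    i.e. the final assessment metric (the 'lookahead' idea)."""
    rng = np.random.default_rng(rng)
    seeds = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # Squared distance of each point to its nearest current seed.
        d2 = np.min(((X[:, None, :] - np.array(seeds)) ** 2).sum(-1), axis=1)
        probs = d2 / d2.sum()
        idx = rng.choice(len(X), size=n_candidates, p=probs)
        # Seed-selection metric: the SSE if this candidate were added.
        best = min(
            idx,
            key=lambda i: np.minimum(d2, ((X - X[i]) ** 2).sum(-1)).sum(),
        )
        seeds.append(X[best])
    return np.array(seeds)

X = np.random.default_rng(0).normal(size=(200, 2))
print(lookahead_seeding(X, k=3).shape)  # (3, 2)
```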
Sequential Cohort Selection
Nana, Hortence Phalonne, Dimitrakakis, Christos
We study the problem of fair cohort selection from an unknown population, with a focus on university admissions. We start with the one-shot setting, where the admission policy must be fixed in advance and remain transparent, before the actual applicant pool is observed. In contrast, the sequential setting allows the policy to be updated across stages as new applicant data becomes available; the update is achieved by optimizing admission policies against a population model trained on data from previous admission cycles. We also study the fairness properties of the resulting policies in the one-shot setting, including meritocracy and group parity.
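A toy sketch of the contrast, under an assumed Gaussian population model (all numbers and names below are hypothetical): a one-shot policy would keep the model estimate fixed at its historical value, while the sequential setting refits it from each observed stage, as in the last line of the loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: applicant scores are Normal(mu, 1); mu_hat is the
# population model's estimate from previous admission cycles.
TRUE_MU, QUOTA, STAGES = 0.5, 10, 3

def threshold_for(mu_hat, pool_size, quota):
    """Pick a score threshold so expected admits under the model equal
    the quota -- a transparent policy fixed before seeing the pool."""
    sims = rng.normal(mu_hat, 1.0, size=(2000, pool_size))
    return float(np.mean(np.sort(sims, axis=1)[:, -quota]))

mu_hat = 0.0  # historical estimate; a one-shot policy never updates it
for stage in range(STAGES):
    pool = rng.normal(TRUE_MU, 1.0, size=100)
    t = threshold_for(mu_hat, pool.size, QUOTA)
    print(f"stage {stage}: threshold {t:.2f}, admits {(pool >= t).sum()}")
    # Sequential setting: update the population model with observed data.
    mu_hat = 0.5 * mu_hat + 0.5 * pool.mean()
```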
Enhancing Selection of Climate Tech Startups with AI -- A Case Study on Integrating Human and AI Evaluations in the ClimaTech Great Global Innovation Challenge
Turliuk, Jennifer, Sevilla, Alejandro, Gorza, Daniela, Hynes, Tod
This case study examines the ClimaTech Great Global Innovation Challenge's approach to selecting climate tech startups by integrating human and AI evaluations. The competition aimed to identify top startups and enhance the accuracy and efficiency of the selection process through a hybrid model. Research shows data-driven approaches help VC firms reduce bias and improve decision-making. Machine learning models have outperformed human investors in deal screening, helping identify high-potential startups. Incorporating AI aimed to ensure more equitable and objective evaluations. The methodology included three phases: initial AI review, semi-finals judged by humans, and finals using a hybrid weighting. In phase one, 57 applications were scored by an AI tool built with StackAI and OpenAI's GPT-4o, and the top 36 advanced. In the semi-finals, human judges, unaware of AI scores, evaluated startups on team quality, market potential, and technological innovation. Each score - human or AI - was weighted equally, resulting in 75 percent human and 25 percent AI influence. In the finals, with five human judges, the weighting shifted to 83.3 percent human and 16.7 percent AI. There was a moderate positive correlation between AI and human scores - Spearman's ρ = 0.47 - indicating general alignment with key differences. Notably, the final four startups, selected mainly by humans, were among those rated highest by the AI. This highlights the complementary nature of AI and human judgment. The study shows that hybrid models can streamline and improve startup assessments. The ClimaTech approach offers a strong framework for future competitions by combining human expertise with AI capabilities.
- Banking & Finance > Trading (0.48)
- Banking & Finance > Capital Markets (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.55)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
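The weighting arithmetic follows directly from giving each score, human or AI, equal weight: with n human judges and one AI score, the AI's share is 1 / (n + 1). A worked example (the judge counts of 3 and 5 are inferred from the reported percentages):

```python
# Equal-per-score weighting: the AI's influence is 1 / (n_judges + 1).
# Judge counts (3 in semi-finals, 5 in finals) are inferred from the
# percentages reported in the abstract above.

def blended_score(human_scores, ai_score):
    """Average all scores with equal weight per score."""
    scores = list(human_scores) + [ai_score]
    return sum(scores) / len(scores)

for n_judges in (3, 5):
    ai_share = 1 / (n_judges + 1)
    print(f"{n_judges} judges: human {1 - ai_share:.1%}, AI {ai_share:.1%}")
# -> 3 judges: human 75.0%, AI 25.0%; 5 judges: human 83.3%, AI 16.7%

print(blended_score([8, 7, 9], ai_score=6))  # semi-final-style blend
```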
C2RUST-BENCH: A Minimized, Representative Dataset for C-to-Rust Transpilation Evaluation
Sirlanci, Melih, Yagemann, Carter, Lin, Zhiqiang
Despite the effort invested in vulnerability detection over the last two decades, memory safety vulnerabilities remain a critical problem. Recent reports suggest that the key solution is to migrate to memory-safe languages. To this end, C-to-Rust transpilation has become popular for resolving memory-safety issues in C programs. Recent works propose C-to-Rust transpilation frameworks; however, a comprehensive evaluation dataset is missing. Although one solution is to assemble a large enough dataset, doing so increases the analysis time of automated frameworks, as well as the manual effort required in some cases. In this work, we develop a method to select functions from a large set to construct a minimized yet representative dataset for evaluating C-to-Rust transpilation. We propose C2RUST-BENCH, which contains 2,905 functions, representative of C-to-Rust transpilation, selected from 15,503 functions of real-world programs.
- North America > United States > Ohio (0.04)
- Asia > Middle East > Oman (0.04)
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
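One plausible way to realize such a minimized-yet-representative selection (not necessarily the paper's method) is to embed each function, cluster the embeddings, and keep one medoid-like representative per cluster. The embedding dimensions and sizes below are illustrative stand-ins.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(embeddings, n_keep, seed=0):
    """Cluster function embeddings and keep, per cluster, the function
    closest to the cluster center (a medoid-like representative)."""
    km = KMeans(n_clusters=n_keep, n_init=10, random_state=seed).fit(embeddings)
    kept = []
    for c in range(n_keep):
        members = np.flatnonzero(km.labels_ == c)
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        kept.append(members[dists.argmin()])
    return sorted(kept)

# Stand-in for feature vectors of the 15,503 real functions (reduced here).
emb = np.random.default_rng(0).normal(size=(500, 16))
print(len(select_representatives(emb, n_keep=50)))  # 50 representatives
```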
A Survey for What Developers Require in AI-powered Tools that Aid in Component Selection in CBSD
Ansari, Mahdi Jaberzadeh, Barcomb, Ann
Although more than four decades have passed since the first component-based software development (CBSD) studies were conducted, there is still no standard method or tool for component selection that is widely accepted by industry. The gulf between industry and academia contributes to the lack of an accepted tool. We conducted a mixed-methods survey of nearly 100 people engaged in component-based software engineering practice or research to better understand the problems facing industry, how these needs could be addressed, and the best practices currently employed in component selection. We also sought to identify and prioritize quality criteria for component selection from an industry perspective. In response to the call for CBSD component selection tools to incorporate recent technical advances, we also explored professionals' perceptions of AI-driven tools, both present and envisioned.
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (27 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.93)
Applicability of the Minimal Dominating Set for Influence Maximisation in Multilayer Networks
Czuba, Michał, Jia, Mingshan, Bródka, Piotr, Musial, Katarzyna
The minimal dominating set (MDS) is a well-established concept in network controllability and has been successfully applied in various domains, including sensor placement, network resilience, and epidemic containment. In this study, we adapt the local-improvement MDS routine and explore its potential for enhancing seed selection for influence maximisation in multilayer networks (MLN). We employ the Linear Threshold Model (LTM), which offers an intuitive representation of influence spread or opinion dynamics by accounting for peer influence accumulation. To ensure interpretability, we utilise rank-refining seed selection methods, with the results then filtered by MDS. Our findings reveal that incorporating MDS into the seed selection process improves spread only in a specific range of situations. Notably, the improvement is observed for larger seed set budgets, lower activation thresholds, and when an "AND" strategy is used to aggregate influence across network layers. This scenario reflects situations where an individual does not require the majority of their acquaintances to hold a target opinion, but must be influenced across all social circles.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
- Europe > United Kingdom > England > Dorset > Bournemouth (0.04)
- (5 more...)
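A single-layer sketch of the pipeline (the paper operates on multilayer networks with per-layer aggregation): rank nodes, filter the seed set through a dominating set, and simulate LTM spread. networkx's greedy dominating_set stands in for the paper's local-improvement MDS routine, and the fraction-of-active-neighbours threshold is a common LTM simplification.

```python
import networkx as nx

def ltm_spread(G, seeds, threshold=0.5):
    """Linear Threshold Model: a node activates once the active fraction
    of its neighbours reaches the threshold; iterate to a fixed point."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in G:
            if v in active:
                continue
            nbrs = list(G[v])
            if nbrs and sum(u in active for u in nbrs) / len(nbrs) >= threshold:
                active.add(v)
                changed = True
    return active

G = nx.karate_club_graph()
budget = 4
ranking = sorted(G, key=G.degree, reverse=True)       # rank-based selection
mds = nx.dominating_set(G)                            # greedy dominating set
filtered = [v for v in ranking if v in mds][:budget]  # MDS-filtered seeds
print("spread:", len(ltm_spread(G, filtered, threshold=0.3)))
```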
PAIR: A Novel Large Language Model-Guided Selection Strategy for Evolutionary Algorithms
Ali, Shady, Ashraf, Mahmoud, Hegazy, Seif, Salem, Fatty, Mokhtar, Hoda, Gaber, Mohamed Medhat, Alrefaie, Mohamed Taher
Evolutionary Algorithms (EAs) often employ random or simplistic selection methods, limiting their exploration of solution spaces and their convergence to optimal solutions. The randomness of crossover and mutation may further limit the algorithm's ability to evolve efficiently. This paper introduces Preference-Aligned Individual Reciprocity (PAIR), a novel selection approach that leverages Large Language Models to emulate human-like mate selection, thereby introducing intelligence into the pairing process in EAs. PAIR prompts an LLM to evaluate individuals within a population on genetic diversity, fitness level, and crossover compatibility, guiding more informed pairing decisions. We evaluated PAIR against a recently published baseline, LLM-driven EA (LMEA). Results indicate that PAIR significantly outperforms LMEA across various TSP instances, achieving lower optimality gaps and improved convergence. The performance gain is especially noticeable when PAIR is combined with the flash thinking model, demonstrating increased population diversity that helps escape local optima. In general, PAIR provides a new in-context-learning strategy for LLM-driven selection in EAs via sophisticated preference modelling, paving the way for improved solutions and further studies into LLM-guided optimization.
- Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
- Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
- Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
- Africa > Middle East > Egypt (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)
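A skeleton of LLM-guided mate selection for a TSP-style EA. Here llm_pair_preference is a hypothetical stand-in scorer; PAIR instead prompts an actual LLM with the candidates' diversity, fitness, and compatibility information.

```python
import random

def tour_length(tour, dist):
    """Length of a closed TSP tour under a distance matrix."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def llm_pair_preference(pop, dist):
    """Hypothetical stand-in: prefer parent pairs that are fit but
    dissimilar. In PAIR, an LLM is prompted to rank candidate pairings
    on genetic diversity, fitness, and crossover compatibility."""
    def score(a, b):
        diversity = sum(x != y for x, y in zip(a, b))
        fitness = -(tour_length(a, dist) + tour_length(b, dist))
        return fitness + diversity
    pairs = [(a, b) for i, a in enumerate(pop) for b in pop[i + 1:]]
    return max(pairs, key=lambda p: score(*p))

rng = random.Random(0)
n = 8
dist = [[abs(i - j) for j in range(n)] for i in range(n)]
pop = [rng.sample(range(n), n) for _ in range(6)]  # random tours
parents = llm_pair_preference(pop, dist)
print("selected pair lengths:", [tour_length(t, dist) for t in parents])
```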
Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm
Li, Zhuo, Du, Yuhao, Jiao, Xiaoqi, Guo, Yiwen, Feng, Yuege, Wan, Xiang, Gao, Anningzhe, Hu, Jinpeng
Selecting high-quality and diverse training samples from extensive datasets plays a crucial role in reducing training overhead and enhancing the performance of Large Language Models (LLMs). However, existing studies fall short in assessing the overall value of selected data, focusing primarily on individual quality, and struggle to strike an effective balance between ensuring diversity and minimizing data point traversals. Therefore, this paper introduces a novel choice-based sample selection framework that shifts the focus from evaluating individual sample quality to comparing the contribution value of different samples when incorporated into the subset. Thanks to the advanced language understanding capabilities of LLMs, we utilize LLMs to evaluate the value of each option during the selection process. Furthermore, we design a greedy sampling process in which samples are incrementally added to the subset, improving efficiency by eliminating the need for exhaustive traversal of the entire dataset under a limited budget. Extensive experiments demonstrate that data selected by our method not only surpasses the performance of the full dataset but also achieves competitive results with state-of-the-art (SOTA) studies, while requiring fewer selections. Moreover, we validate our approach on a larger medical dataset, highlighting its practical applicability in real-world applications.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Anhui Province > Hefei (0.04)
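A minimal sketch of the choice-based greedy loop: at each step a small set of options is compared by marginal contribution to the growing subset, avoiding a full traversal of the dataset. The scorer option_value is a hypothetical stand-in for the LLM's judgment, and the quality/feature fields are illustrative.

```python
import random

def option_value(candidate, subset):
    """Hypothetical contribution score: candidate quality plus a bonus
    for being unlike anything already selected (diversity). In the
    paper, an LLM compares options' contributions to the subset."""
    novelty = min((abs(candidate["feature"] - s["feature"]) for s in subset),
                  default=1.0)
    return candidate["quality"] + novelty

rng = random.Random(0)
pool = [{"quality": rng.random(), "feature": rng.random()} for _ in range(100)]
budget, per_round = 10, 8
subset = []
while len(subset) < budget:
    options = rng.sample(pool, per_round)  # compare a few, not the whole pool
    best = max(options, key=lambda c: option_value(c, subset))
    subset.append(best)   # incremental, choice-based addition
    pool.remove(best)
print(len(subset), "samples selected")
```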
Meta-Instance Selection. Instance Selection as a Classification Problem with Meta-Features
Blachnik, Marcin, Ciepliński, Piotr
Data pruning, or instance selection, is an important problem in machine learning, especially for the nearest neighbor classifier. However, while data pruning speeds up the prediction phase, the selection process itself raises speed and efficiency concerns. In response, this study proposes transforming instance selection into a classification task conducted in a unified meta-feature space, where each instance can be classified and assigned to either the "to keep" or "to remove" class. This approach requires training an appropriate meta-classifier, which can be developed from historical instance selection results on other datasets, using reference instance selection methods as a labeling tool. This work proposes constructing the meta-feature space from properties extracted from the nearest neighbor graph. Experiments conducted on 17 datasets of varying sizes and five reference instance selection methods (ENN, Drop3, ICF, HMN-EI, and CCIS) demonstrate that the proposed solution achieves results comparable to the reference methods while significantly reducing computational complexity. In the proposed approach, the computational complexity of the system depends only on identifying the k-nearest neighbors of each data sample and running the meta-classifier.

Moreover, scaling laws translate into demands for enormous compute during both training and prediction, which is not always available in real-life scenarios where compute resources are limited. In that case, both the training data and the prediction model should require modest computing resources; the training set should therefore be as small as possible while preserving the prediction accuracy of the original training set. This issue is not new: its study began primarily for the nearest neighbor classifier, under the name of instance selection. Already in the late 1960s and early 1970s, algorithms such as Condensed Nearest Neighbor (CNN) and Edited Nearest Neighbor (ENN), among many others, were developed. Benchmarks of instance selection point to the Drop3 [3] and ICF [4] algorithms as the most widely used; despite not being new, they exhibit excellent properties in terms of the balance between the prediction accuracy of the kNN algorithm and the reduction of the stored prototype set (the reduction rate) [5]. These algorithms are applicable not only as elements of the learning process for the kNN algorithm (prototype selection as part of learning) but also as universal algorithms for reducing the size of the training set for any classifier, thereby accelerating the learning of complex predictive models, the search for optimal parameters, etc. Examples of such applications can be found in [6, 7] and in [5].
- North America > United States (0.04)
- Europe > Poland (0.04)
- Europe > France (0.04)
- Asia (0.04)
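A compact sketch of the meta-classification view: per-instance meta-features are computed from the k-NN graph, reference labels come from an ENN-like "majority of neighbours agree" rule, and a meta-classifier learns to predict keep/remove. The feature set and labeling rule are illustrative, not the paper's exact choices, and the real method trains on historical results from other datasets.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors

def knn_meta_features(X, y, k=5):
    """Per-instance meta-features from the k-NN graph: label agreement
    among neighbours, plus mean and spread of neighbour distances."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, idx = nn.kneighbors(X)
    dist, idx = dist[:, 1:], idx[:, 1:]          # drop the self-neighbour
    same = (y[idx] == y[:, None]).mean(axis=1)   # neighbour label agreement
    return np.column_stack([same, dist.mean(axis=1), dist.std(axis=1)])

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.3 * rng.normal(size=300) > 0).astype(int)

meta_X = knn_meta_features(X, y)
keep = (meta_X[:, 0] >= 0.5).astype(int)         # ENN-like reference labels
meta_clf = RandomForestClassifier(random_state=0).fit(meta_X, keep)
# On a new dataset, the same meta-features feed the trained meta-classifier.
print("kept fraction:", meta_clf.predict(meta_X).mean())
```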